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Abstract 

Under categorial grammars that have pow- 
erful rules like composition, a simple 
n-word sentence can have exponentially 
many parses. Generating all parses is ineffi- 
cient and obscures whatever true semantic 
ambiguities arc in the input. This paper 
addresses the problem for a fairly general 
form of Combinatory Categorial Grammar, 
by means of an efficient, correct, and easy 
to implement normal-form parsing tech- 
nique. The parser is proved to find ex- 
actly one parse in each semantic equiv- 
alence class of allowable parses; that is, 
spurious ambiguity (as carefully defined) 
is shown to be both safely and completely 
eliminated. 

1 Introduction 

Combinatory Categorial Grammar (Steedman, 
1990), like other "flexible" categorial grammars, 
suffers from spurious ambiguity (Wittenburg, 1986). 
The non-standard constituents that are so crucial to 
CCG's analyses in ([!]), and in its account of into- 
national focus (Prevost & Steedman, 1994), remain 
available even in simpler sentences. This renders (Q) 
syntactically ambiguous. 

(1) a. Coordination: [[ John likes ] ^; MP , and 

[Mary pretends to like]g/ NP ], the big 
galoot in the corner, 
b. Extraction: Everybody at this party 
[whom [ John likcs jqmp ] is a big galoot. 

(2) a. John [likes Mary] S \ NP . 
b. [ John likes ] q /mp Mary. 

The practical problem of "extra" parses in (||) be- 
comes exponentially worse for longer strings, which 
can have up to a Catalan number of parses. An 



exhaustive parser serves up 252 CCG parses of (pj), 
which must be sifted through, at considerable cost, 
in order to identify the two distinct meanings for 
further processing.^ 

(3) the galoot in the corner 

NP/N N (N\N)/NP NP/N N 

that I said Mary 

(N\N)/(S/NP) S/(S\NP) (S\NP)/S S/(S\NP) 

pretends to 

(S\NP)/(S inf \NP) (S inf \NP)/(S stcm \NP) 

like 

(Sstom\NP)/NP 

This paper presents a simple and flexible CCG 
parsing technique that prevents any such explosion 
of redundant CCG derivations. In particular, it is 



proved in §4.2 that the method constructs exactly 
one syntactic structure per semantic reading — e.g., 
just two parses for (||). All other parses are sup- 
pressed by simple normal-form constraints that are 
enforced throughout the parsing process. This ap- 
proach works because CCG's spurious ambiguities 
arise (as is shown) in only a small set of circum- 
stances. Although similar work has been attempted 
in the past, with varying degrees of success (Kart- 
tunen, 1986; Wittenburg, 1986; Pareschi & Steed- 
man, 1987; Bouma, 1989; Hepple & Morrill, 1989; 
Konig, 1989; Vijay-Shanker & Weir, 1990; Hepple, 
1990; Moortgat, 1990; Hendriks, 1993; Niv, 1994), 
this appears to be the first full normal-form result 
for a categorial formalism having more than context- 
free power. 

2 Definitions and Related Work 

CCG may be regarded as a generalization of context- 
free grammar (CFG) — one where a grammar has 
infinitely many nonterminals and phrase-structure 
rules. In addition to the familiar atomic nonter- 
minal categories (typically S for sentences, N for 



*This material is based upon work supported under 
a National Science Foundation Graduate Fellowship. I 
have been grateful for the advice of Aravind Joshi, Nobo 
Komagata, Seth Kulick, Michael Niv, Mark Steedman, 
and three anonymous reviewers. 



1 Namely, Mary pretends to like the galoot in 168 
parses and the corner in 84. One might try a statis- 
tical approach to ambiguity resolution, discarding the 
low-probability parses, but it is unclear how to model 
and train any probabilities when no single parse can be 
taken as the standard of correctness. 



nouns, NP for noun phrases, etc.), CCG allows in- 
finitely many slashed categories. If x and y are 
categories, then x/y (respectively x\y) is the cat- 
egory of an incomplete x that is missing a y at its 
right (respectively left). Thus verb phrases are an- 
alyzed as subjectless sentences S\NP, while "John 
likes" is an objectless sentence or S/NP. A complex 
category like ( (S\NP) \ (S\NP) ) /N may be written as 
S\NP\(S\NP)/N, under a convention that slashes are 
left-associative. 

The results herein apply to the TAG-equivalent 
CCG formalization given in (Joshi et al., 1991) 
In this variety of CCG, every (non-lexical) phrase- 
structure rule is an instance of one of the following 
binary-rule templates (where n > 0): 

(4) Forward generalized composition >Bn: 

x/y y \n z n ' ' ' 1 2 %2 1 1^1 > X \ n Z n ■ ■ ■ \%Z2 \Z\ 

Backward generalized composition <Bn: 
y \n z n ' ' ' I2Z2 \\Z\ x\y -> x \ n z n ■ ■ ■ \ 2 z 2 \izi 

Instances with n — are called application rules, and 
instances with n > 1 are called composition rules. In 
a given rule, x, y, Zi . . . z n would be instantiated as 
categories like NP, S/NP, or S\NP\ (S\NP) /N. Each of 
|i through „ would be instantiated as either / or \. 

A fixed CCG grammar need not include every 
phrase-structure rule matching these templates. In- 
deed, (Joshi et al., 1991) place certain restrictions 
on the rule set of a CCG grammar, including a re- 
quirement that the rule degree n is bounded over the 
set. The results of the present paper apply to such 
restricted grammars and also more generally, to any 
CCG-style grammar with a decidable rule set. 

Even as restricted by (Joshi et al., 1991), CCGs 
have the "mildly context-sensitive" expressive power 
of Tree Adjoining Grammars (TAGs). Most work 
on spurious ambiguity has focused on categorial for- 
malisms with substantially less power. (Hepple, 
1990) and (Hendriks, 1993), the most rigorous pieces 
of work, each establish a normal form for the syn- 
tactic calculus of (Lambek, 1958), which is weakly 
context-free. (Konig, 1989; Moortgat, 1990) have 
also studied the Lambek calculus case. (Hepple & 
Morrill, 1989), who introduced the idea of normal- 
form parsing, consider only a small CCG frag- 
ment that lacks backward or order-changing com- 
position; (Niv, 1994) extends this result but does 
not show completeness. (Wittenburg, 1987) assumes 
a CCG fragment lacking order-changing or higher- 
order composition; furthermore, his revision of the 
combinators creates new, conjoinable constituents 
that conventional CCG rejects. (Bouma, 1989) pro- 
poses to replace composition with a new combina- 
tor, but the resulting product-grammar scheme as- 

2 This formalization sweeps any type-raising into the 
lexicon, as has been proposed on linguistic grounds 
(Dowty, 1988; Steedman, 1991, and others). It also 
treats conjunction lexically, by giving "and" the gener- 
alized category x\x/x and barring it from composition. 



signs different types to "John likes" and "Mary pre- 
tends to like," thus losing the ability to conjoin such 
constituents or subcategorize for them as a class. 
(Pareschi & Steedman, 1987) do tackle the CCG 
case, but (Hepple, 1987) shows their algorithm to 
be incomplete. 

3 Overview of the Parsing Strategy 

As is well known, general CFG parsing methods 
can be applied directly to CCG. Any sort of chart 
parser or non-deterministic shift-reduce parser will 
do. Such a parser repeatedly decides whether two 
adjacent constituents, such as S/NP and NP/N, should 
be combined into a larger constituent such as S/N. 
The role of the grammar is to state which combi- 
nations are allowed. The key to efficiency, we will 
see, is for the parser to be less permissive than the 
grammar — for it to say "no, redundant" in some 
cases where the grammar says "yes, grammatical." 

(||) shows the constituents that untrammeled 
CCG will find in the course of parsing "John likes 
Mary." The spurious ambiguity problem is not that 
the grammar allow s (pa ), but that the grammar al- 
lows both (|5f|) and (|5g|) — distinct parses of the same 
string, with the same meaning. 

(5) a. [John] s/(s \ NP) 

b. [likes] ( S \np)/np 

c. [John likes]g/ NP 

d. [Mary] NP 

e. [likes Mary] s \ N p 

f. [[John likes] Mary] s < — to be disallowed 

g. [John [likes Mary]] s 

The proposal is to construct all constituents 
shown in (g) except for (j5|). If wc slightly con- 
strain the use of the grammar rules, the parser will 
still produce ( |5c[ ) and ( |5dj ) — constituents that are 
indispensable in contexts like (|l|) — while refusing to 
combine those constituents into (pq). The relevant 
rule S/NP NP — > S will actually beblocked when it 
attempts to construct (|fj). Although rule-blocking 
may eliminate an analysis of the sentence, as it does 
here, a semantically equivalent analysis such as ( pgl ) 
will always be derivable along some other route. 

In general, our goal is to discover exactly one anal- 
ysis for each <substring, meaning> pair. By prac- 
ticing "birth control" for each bottom-up generation 
of constituents in this way, we avoid a population 
explosion of parsing options. "John likes Mary" has 
only one reading semantically, so just one of its anal- 
yses (Pt)-(pg|) is discovered while parsing @. Only 
that analysis, and not the other, is allowed to con- 
tinue on and be built into the final parse of (|J). 

(6) that galoot in the corner that thinks [John 
likes Mary] s 

For a chart parser, where each chart cell stores the 
analyses of some substring, this strategy says that 



all analyses in a cell are to be semantically distinct. 
(Karttunen, 1986) suggests enforcing that property 
directly — by comparing each new analysis semanti- 
cally with existing analyses in the cell, and refus- 
ing to add it if redundant — but (Hepple & Morrill, 
1989) ©bserve briefly that this is inefficient for large 
charts.13 The following sections show how to obtain 
effectively the same result without doing any seman- 
tic interpretation or comparison at all. 

4 A Normal Form for "Pure" CCG 

It is convenient to begin with a special case. Sup- 
pose the CCG grammar includes not some but all 
instances of the binary rule templates in (||). (As 
always, a separate lexicon specifies the possible cat- 
egories of each word.) If we group a sentence's parses 
into semantic equivalence classes, it always turns out 
that exactly one parse in each class satisfies the fol- 
lowing simple declarative constraints: 

(7) a. No constituent produced by >Bn, any 
n > 1, ever serves as the primary (left) 
argument to >Bn', any n' > 0. 
b. No constituent produced by <Bn, any 
n > 1, ever serves as the primary (right) 
argument to <Bn' , any n' > 0. 

The notation here is from (||). More colloquially, 
(0) says that the output of rightward (leftward) com- 
position may not compose or apply over anything to 
its right (left). A parse tree or subtree that satisfies 
(Q) is said to be in normal form (NF). 

As an example, consider the effect of these restric- 
tions on the simple sentence "John likes Mary." Ig- 
noring the tags -OT, -FC, and -BC for the moment, 
( |Sa| ) is a normal- form parse. Its com pet itor (8b) is 
not, nor is any larger tree containing (pq). But non- 



3 How inefficient? (i) has exponentially many seman- 
tically distinct parses: n = 10 yields 82,756,612 parses 

48,620 equivalence classes. Karttunen's 



(S) 



method must therefore add 48,620 representative parses 
to the appropriate chart cell, first comparing each one 
against all the previously added parses — of which there 
are 48,620/2 on average — to ensure it is not semantically 
redundant. (Additional comparisons are needed to reject 
parses other than the lucky 48,620.) Adding a parse can 
therefore take exponential time. 



(i) ... S/S S/S S/S S S\S S\S S\S ... 

Structure sharing does not appear to help: parses that 
are grouped in a parse forest have only their syntactic 
category in common, not their meaning. Karttunen's ap- 
proach must tease such parses apart and compare their 
various meanings individually against each new candi- 
date. By contrast, the method proposed below is purely 
syntactic — just like any "ordinary" parser — so it never 
needs to unpack a subforest, and can run in polynomial 
time. 



standard constituents are allowed when necessary: 
( |Sc| ) is in normal form (cf. (|l|)). 



(8) 



S-OT 




S/(S\NP)-OT 
I 

John 



S\NP-OT 



(S\NP)/NP-OT NP-OT 

I I 

likes Mary 



forward application blocked by fl7q) 
(equivalently, not permitted by lufM)) 




S/(S\NP)-OT (S\NP) /NP-OT 

I I 

John likes 

N\N-OT 



whom 




(N\N)/(S/NP)-OT 



S/(S\NP)-OT (S\NP) /NP-OT 



John 



likes 



It is not hard to see that ( |7a| ) eliminates all but 
right-branching parses of "forward chains" like A/B 
B/C C or A/B/C C/D D/E/F/G G/H, and that @ 
eliminates all but left-branching parses of "backward 
chains." (Thus every functor will get its arguments, 
if possible, before it becomes an argument itself.) 
But it is hardly obvious that ([?]) eliminates all of 
CCG's spurious ambiguity. One might worry about 
unexpected interactions involving crossing compo- 
sition rules like A/B B\C — > A\C. Significantly, it 
turns out that (^) really does suffice; the proof is 
in $i~2. 



It is trivial to modify any sort of CCG parser 
to find only the normal-form parses. No seman- 
tics is necessary; simply block any rule use that 
would violate <Q). In general, detecting violations 
will not hurt performance by more than a constant 
factor. Indeed, one might implement (0) by modi- 
fying CCG's phrase-structure grammar. Each ordi- 
nary CCG category is split into three categories that 
bear the respective tags from (||). The 24 templates 
schematized in ( |Io| ) replace the two templates of (^) . 
Any CFG-style method can still parse the resulting 
spuriosity-free grammar, with tagged parses as in 
(0). In particular, the polynomial-time, polynomial- 
space CCG chart parser of (Vijay-Shanker & Weir, 
1993) can be trivially adapted to respect the con- 
straints by tagging chart entries. 



(9) -FC output of >Bn, some n > 1 (a forward composition rule) 
-BC output of <Bn, some n > 1 (a backward composition rule) 
-OT output of >B0 or <B0 (an application rule), or lexical item 

■ y-FC ) 



(10) a. Forward application >B0: 



x/y-BC 
x/y-OT 



r y-FC ' 

b. Backward application <B0: < y— BC 

I y-OT . 



y-BC > 

. y-oT J 

x\y-FC 
x\y-OT 



x-OT 



X-OT 



c. Fwd. composition >Bn (n > 1): 

d. Bwd. composition <Bn (n > 1): 



(11) a. Syn/sem for >Bn {n > 0): x/y 

t 
f 

b. Syn/sem for <Bn (n > 0): y \ n z n 



x/y-BC 
x/y-OT 

V \nZ n ■ ■ ■ \ 2 Z 2 |lZl-FC ' 
2/ jn^n • • • \ 2 Z 2 jl^l-BC 
y \nZ n ■ ■ ■ \ 2 Z 2 jl^l-OT 

V \nZ n ■ ■ ■ \ 2 Z 2 \lZl ~ 
t 
fj 

■ ■ ■ \ 2 z 2 \iZ\ x\y - 
t t 
9 f 



\izi- 


-FC 


\izi- 


-BC 


\izi- 


-OT 


x\y- 


-FC 


x\y- 


-OT 



\ 2 Z 2 |lZi-FC 



\ 2 Z 2 |iZi-BC 



X \ n Z n ■ ■ ■ \ 2 Z 2 \lZ X 
t 

\ci\c 2 . . . Xc n .f{g{c 1 ){c 2 ) ■ ■ ■ (c„)) 



X \riZ r{ 



t 



\ 2 z 2 \ \Z\ 



Xci\c 2 . . . Xc n .f(g(ci)(c 2 ) ■ ■ ■ (c„)) 



(12) a. 



A/C/F 



A/C/D D/F 
A/B^B/C/D d/fT^/f 




A/B B/C/D 



\x\y.f(g(h(k(x)))(y)) 
A/C/F 



A/B B/C/D D/E E/F 

/ 9 h k 



It is interesting to note a rough resemblance be- 
tween the tagged version of CCG in (]l(]) and the 
tagged Lambek calculus L*, which (Hendriks, 1993) 
developed to eliminate spurious ambiguity from the 
Lambek calculus L. Although differences between 
CCG and L mean that the details are quite different, 
each system works by marking the output of certain 
rules, to prevent such output from serving as input 
to certain other rules. 

4.1 Semantic equivalence 

We wish to establish that each semantic equivalence 
class contains exactly one NF parse. But what does 
"semantically equivalent" mean? Let us adopt a 
standard model-theoretic view. 

For each leaf (i.e., lexeme) of a given syntax tree, 
the lexicon specifies a lexical interpretation from the 
model. CCG then provides a derived interpretation 
in the model for the complete tree. The standard 
CCG theory builds the semantics compositionally, 
guided by the syntax, according to (pd|). We may 
therefore regard a syntax tree as a static "recipe" for 
combining word meanings into a phrase meaning. 



One might choose to say that two parses are se- 
mantically equivalent iff they derive the same phrase 
meaning. However, such a definition would make 
spurious ambiguity sensitive to the fine-grained se- 
mantics of the lexicon. Are the two analyses of 
VP/VP VP VP\VP semantically equivalent? If the 
lexemes involved are "softly knock twice," then yes, 
as softly (twice(knock)) and twice(softly(knock)) ar- 
guably denote a common function in the semantic 
model. Yet for "intentionally knock twice" this is 
not the case: these adverbs do not commute, and 
the semantics are distinct. 

It would be difficult to make such subtle distinc- 
tions rapidly. Let us instead use a narrower, "inten- 
sional" definition of spurious ambiguity. The trees in 
(|i"2|a-b) will be considered equivalent because they 
specify the same "recipe," shown in (|l2|c). No mat- 
ter what lexical interpretations /, g, h, k are fed into 
the leaves A/B, B/C/D, D/E, E/F, both the trees end 
up with the same derived interpretation, namely a 
model element that can be determined from /, g, h, k 
by calculating XxXy.f(g{h(k(x)))(y)). 

By contrast, the two readings of "softly knock 



twice" are considered to be distinct, since the parses 
specify different recipes. That is, given a suitably 
free choice of meanings for the words, the two parses 
can be made to pick out two different VP-type func- 
tions in the model. The parser is therefore conser- 
vative and keeps both parsesJj 

4.2 Normal- form parsing is safe &; complete 

The motivation for producing only NF parses (as 
defined by (0)) lies in the following existence and 
uniqueness theorems for CCG. 

Theorem 1 Assuming "pure CCG, " where all pos- 
sible rules are in the grammar, any parse tree a is se- 
mantically equivalent to some NF parse tree NF(a). 

(This says the NF parser is safe for pure CCG: we 
will not lose any readings by generating just normal 
forms.) 

Theorem 2 Given distinct NF trees a ^ a' (on the 
same sequence of leaves). Then a and a' are not 
semantically equivalent. 

(This says that the NF parser is complete: generat- 
ing only normal forms eliminates all spurious ambi- 
guity.) 

Detailed proofs of these theorems are available on 
the cmp-lg. archive, but can only be sketched here. 
Theorem |l| is proved by a constructive induction on 
the order of a, given below and illustrated in ( |l3|) : 

• For a a leaf, put NF(a) = a. 

• (<R, f3,y> denotes the parse tree formed by com- 
bining subtrees /3, 7 via rule R.) 

If a = <R,0,j>, then take NF(a) = 
<R, NF(0),NF (7)>, which exists by inductive 
hypothesis, unless this is not an NF tree. In 
the latter case, WLOG, R is a forward rule and 
NF(j3) = <Q, 0i, 02> for some forward com- 
position rule Q. Pure CCG turns out to pro- 
vide forward rules S and T such that a' — 
<S, 0i, NF (<T, [3 2 , 7>)> is a constituent and 
is semantically equivalent to a. Moreover, since 
01 serves as the primary subtree of the NF tree 
NF(0), 0i cannot be the output of forward com- 
position, and is NF besides. Therefore a' is NF: 
take NF{a) = a'. 

(13) If NF(0) not output of fwd. composition, 

def 

= NF(a) 
7 NF{0) NF(<y) 

else a — ==>• 

7 NF(0) 7 




4 (Hepple & Morrill, 1989; Hepple, 1990; Hendriks, 
1993) appear to share this view of semantic equivalence. 
Unlike (Karttunen, 1986), they try to eliminate only 
parses whose denotations (or at least A-terms) are sys- 
tematically equivalent, not parses that happen to have 
the same denotation through an accident of the lexicon. 




def 



NF(a) 



01 



This construction resembles a well-known normal- 
form reduction procedure that (Hepple & Morrill, 
1989) propose (without proving completeness) for a 
small fragment of CCG. 

The proof of theorem || (completeness) is longer 
and more subtle. First it shows, by a simple induc- 
tion, that since a and a' disagree they must disagree 
in at least one of these ways: 

(a) There are trees 0, 7 and rules R ^ R' such that 
<R, 0, 7> is a subtree of a and <R', 0, 7> is a 
subtree of a' . (For example, S/S S\S may form 
a constituent by cither <Blx or >Blx.) 

(b) There is a tree 7 that appears as a subtree of 
both a and a', but combines to the left in one 
case and to the right in the other. 

Either condition, the proof shows, leads to different 
"immediate scope" relations in the full trees a and a' 
(in the sense in which / takes immediate scope over 
9 in f(g(x)) but not in f(h(g(x))) or g(f(x))). Con- 
dition (a) is straightforward. Condition (b) splits 
into a case where 7 serves as a secondary argument 
inside both a and a' , and a case where it is a primary 
argument in a or a' . The latter case requires consid- 
eration of 7's ancestors; the NF properties crucially 
rule out counterexamples here. 

The notion of scope is relevant because semantic 
interpretations for CCG constituents can be written 
as restricted lambda terms, in such a way that con- 
stituents having distinct terms must have different 
interpretations in the mo del ( for suitable interpreta- 
tions of the words, as in §4.1). Theorem is proved 
by showing that the terms for a and a' differ some- 
where, so correspond to different semantic recipes. 

Similar theorems for the Lambek calculus were 
previously shown by (Hepple, 1990; Hendriks, 1993). 
The present proofs for CCG establish a result that 
has long been suspected: the spurious ambiguity 
problem is not actually very widespread in CCG. 
Theorem || says all cases of spurious ambiguity 
can be eliminated through the construction given 
in theorem |l| But that construction merely en- 
sures a right-branching structure for "forward con- 
stituent chains" (such as A/B B/C C or A/B/C C/D 
D/E/F/G G/H), and a left-branching structure for 
backward constituent chains. So these familiar 
chains are the only source of spurious ambiguity in 
CCG. 

5 Extending the Approach to 
"Restricted" CCG 

The "pure" CCG of §|is a fiction. Real CCG gram- 
mars can and do choose a subset of the possible rules. 



For instance, to rule out (|i~4|), the (crossing) back- 
ward rule N/N N\N — > N/N must be omitted from 
English grammar. 

(14) [the NP/N [[big N/N [that likes John] N \ N ] N/N 
galoot N ] N ] NP 



If 



rules 



removed from a "pure" CCG 
grammar, some parses will become unavailable. 
Theorem remains true (< 1 NF per reading). 
Whether theorem [l] (> 1 NF per reading) remains 
true depends on what set of rules is removed. For 
most linguistically reasonable choices, the proof of 
theorem [H will go throughjj so that the normal-form 
parser of §0 remains safe. But imagine removing 
only the rule B/C C — > B: this leaves the string A/B 
B/C C with a left-branching parse that has no (legal) 
NF equivalent. 

In the sort of restricted grammar where theorem [j] 
does not obtain, can we still find one (possibly non- 
NF) parse per equivalence class? Yes: a different 
kind of efficient parser can be built for this case. 

Since the new parser must be able to generate a 
non-NF parse when no equivalent NF parse is avail- 
able, its method of controlling spurious ambiguity 
cannot be to enforce the constraints (0). The old 
parser refused to build non-NF constituents; the new 
parser will refuse to build constituents that are se- 
mantically equivalent to already-built constituents. 

This idea originates with (Karttunen, 1986). 
However, we can take advantage of the core result 
of this paper, theorems |l| and p| to do Karttunen's 
redundancy check in O(l) time — no worse than the 
normal-form parser's check for -FC and -BC tags. 
(Karttunen's version takes worst-case exponential 
time for each redundancy check: see footnote §[|) 

The insight is that theorems |] and || estab- 
lish a one-to-one map between semantic equivalence 
classes and normal forms of the pure (unrestricted) 
CCG: 

(15) Two parses a, a' of the pure CCG are 
semantically equivalent iff they have the 
same normal form: NF(a) — NF(a'). 

The NF function is defined recursively by §fO's 
proof of theorem [I]; semantic equivalence is also 
defined independently of the grammar. So ( J15| ) is 
meaningful and true even if a, a' are produced by 
a restricted CCG. The tree NF(a) may not be a 
legal parse under the restricted grammar. How- 
ever, it is still a perfectly good data structure that 
can be maintained outside the parse chart, to serve 



For the proof to work, the rules S and T must be 
available in the restricted grammar, given that R and Q 
are. This is usually true: since (hj) favors standard con- 
stituents and prefers application to composition, most 
grammars will not block the NF derivation while allow- 
ing a non-NF one. (On the other hand, the NF parse of 
A/B B/C C/D/E uses >B2 twice, while the non-NF parse 
gets by with >B2 and >B1.) 



as a magnet for a's semantic class. The proof of 
theorem n] (see (13h) actually shows how to con- 
struct NF[a) in 0(1) time from the values of NF on 
smaller constituents. Hence, an appropriate parser 
can compute and cache the NF of each parse in O(l) 
time as it is added to the chart. It can detect redun- 
dant parses by noting (via an 0(1) array lookup) 
that their NFs have been previously computed. 

Figure ([!]) gives an efficient CKY-style algorithm 
based on this insight. (Parsing strategies besides 
CKY would also work, in particular (Vijay-Shanker 
& Weir, 1993).) The management of cached NFs in 
steps ^, [0| and especially [l6| ensures that duplicate 
NFs never enter the oldNFs array: thus any alter- 
native copy of a.n/has the same array coordinates 
used for a. nf itself, because it was built from identi- 
cal subtrees. 



The function Pref erableTo(cr, r) (step 15) pro- 
vides flexibility about which parse represents its 
class. PreferableTo may be defined at whim to 
choose the parse discovered first, the more left- 
branching parse, or the parse with fewer non- 
standard constituents. Alternatively, PreferableTo 
may call an intonation or discourse module to pick 
the parse that better reflects the topic-focus divi- 
sion of the sentence. (A variant algorithm ignores 
PreferableTo and constructs one parse forest per 
reading. Each forest can later be unpacked into in- 
dividual equivalent parse trees, if desired.) 

(Vijay-Shanker & Weir, 1990) also give a method 
for removing "one well-known sou rce" of spurious 
ambiguity from restricted CCGs; § 4.2 
that this is in fact the only source 



above shows 
However, their 



method relies on the grammaticality of certain inter- 
mediate forms, and so can fail if the CCG rules can 
be arbitrarily restricted. In addition, their method 
is less efficient than the present one: it considers 
parses in pairs, not singly and does not remove any 
parse until the entire parse forest has been built. 

6 Extensions to the CCG Formalism 

In addition to the Bn ("generalized composition") 
rules given in §|, which give CCG power equivalent 
to TAG, rules based on the S ("substitution") and 
T ( "type-raising" ) combinators can be linguistically 
useful. S provides another rule template, used in 
the analysis of parasitic gaps (Steedman, 1987; Sz- 
abolcsi, 1989): 



(16) a. >S: 



b. <S: 



x/y\iz 

I 
f 

y\iz 



y\iz 
t 

9 

x\v\iz 



x \\Z 

t 

Xz.f(z)(g(z)) 



Although S interacts with Brt to produce another 
source of spurious ambiguity, illustrated in ([Tt]) , the 
additional ambiguity is not hard to remove. It can 
be shown that when the restriction ( |l8| ) is used to- 
gether with (0), the system again finds exactly one 



1. for i := 1 to n 

2. C[i — 1, i] := LexCats(word[i]) (* word i stretches from point i — 1 to point i *) 

3. for width := 2 to n 

4. for siari := to n — width 

5. end := s£art + width 

6. for mid := siari + 1 to end — 1 

7. for each parse tree a = <R, (3, -f> that could be formed by combining some 

13 <E C[start, mid) with some 7 € C[nii<i, end] by a rule i? of the (restricted) grammar 

8. a.nf := NF(a) (* can be computed in constant time using the .nf fields of (3, 7, and 

other constituents already in C. Subtrees are also NF trees. *) 

9. existingNF := oldNFs[a.nf .rule, a.nf .leftchild.seqno, a.nf .rightchild.seqno] 

10. if undefined (existingNF) (* the first parse with this NF *) 

11. a.nf.seqno :— (counter := counter + 1) (* number the new NF & add it to oldNFs *) 

12. oldNFs[a.nf .rule, a.nf .leftchild.seqno, a.nf .rightchild.seqno] := a.nf 

13. add a to C [start, end] 

14. a.nf.currparse := a 

15. elsif Pref erableTo(a, existingNF .cur r parse) (* replace reigning parse? *) 

16. a.nf :— existingNF (* use cached copy of NF, not new one *) 

17. remove a.nf.currparse from C[start, end] 

18. a,dd a to C [start, end] 

19. a.nf.currparse := a 

20. return(all parses from C[0, n] having root category S) 



Figure 1: Canonicalizing CCG parser that handles arbitrary restrictions on the rule set. (In practice, a 
simpler normal- form parser will suffice for most grammars.) 



parse from every equivalence class. 
(17) a. VPq/NP (<Bx) 




VP 2 /NP VPi\VP 2 /NP 
filed [without-reading] 

b. VP /NP (<Sx) 




VP 2 /NP VP \VP 2 /NP (<B2) 




VPi\VP 2 /NP VP \VPi 
(18) a. No constituent produced by >Bn, any 
n > 2, ever serves as the primary (left) 
argument to >S. 
b. No constituent produced by <Bn, any 
n > 2, ever serves as the primary (right) 
argument to <S. 

Type-raising presents a greater problem. Vari- 
ous new spurious ambiguities arise if it is permit- 
ted freely in the grammar. In principle one could 
proceed without grammatical type-raising: (Dowty, 
1988; Steedman, 1991) have argued on linguistic 
grounds that type-raising should be treated as a 
mere lexical redundancy property. That is, when- 
ever the lexicon contains an entry of a certain cate- 



gory X, with semantics x, it also contains one with 
(say) category T/(T\X) and interpretation Xp.p(x). 
As one might expect, this move only sweeps the 
problem under the rug. If type-raising is lexical, 
then the definitions of this paper do not recognize 
( |l9| ) as a spurious ambiguity, because the two parses 
are now, technically speaking, analyses of different 
sentences. Nor do they recognize the redundancy in 
(p0|), because — just as for the example "softly knock 
twice" in §[0] — it is contingent on a kind of lexical 
coincidence, namely that a type-raised subject com- 
mutes with a (generically) type-raised object. Such 
ambiguities are left to future work. 

(19) [John NP left S \ NP ] s vs. [John s/(s \ NP) left s \ NP ] s 

(20) [S/(S\np s ) [S\NP S /Ne / NPl T\(T/np )]] s /Si 
vs. [S/(S\np s ) S\np s /npo/n Pi ] T\(T/np )] s/Si 

7 Conclusions 

The main contribution of this work has been formal: 
to establish a normal form for parses of "pure" Com- 
binatory Categorial Grammar. Given a sentence, 
every reading that is available to the grammar has 
exactly one normal- form parse, no matter how many 
parses it has in toto. 

A result worth remembering is that, although 
TAG-cquivalcnt CCG allows free interaction among 
forward, backward, and crossed composition rules of 
any degree, two simple constraints serve to eliminate 
all spurious ambiguity. It turns out that all spuri- 
ous ambiguity arises from associative "chains" such 
as A/B B/C C or A/B/C C/D D/E\F/G G/H. (Wit- 



tenburg, 1987; Hepple & Morrill, 1989) anticipate 
this result, at least for some fragments of CCG, but 
leave the proof to future work. 

These normal-form results for pure CCG lead di- 
rectly to useful parsers for real, restricted CCG 
grammars. Two parsing algorithms have been pre- 
sented for practical use. One algorithm finds only 
normal forms; this simply and safely eliminates spu- 
rious ambiguity under most real CCG grammars. 
The other, more complex algorithm solves the spu- 
rious ambiguity problem for any CCG grammar, by 
using normal forms as an efficient tool for grouping 
semantically equivalent parses. Both algorithms are 
safe, complete, and efficient. 

In closing, it should be repeated that the results 
provided are for the TAG- equivalent Bn (general- 
ized composition) formalism of (Joshi et al., 1991), 
optionally extended with the S (substitution) rules 
of (Szabolcsi, 1989). The technique eliminates all 
spurious ambiguities resulting from the interaction 
of these rules. Future work should continue by 
eliminating the spurious ambiguities that arise from 
grammatical or lexical type-raising. 



Rpfprpnrps 



Gosse tiouma. iy»9. Himcient processing ot flexible 
categorial grammar. In Proceedings of the Fourth 
Conference of the European Chapter of the Associ- 
ation for Computational Linguistics, 19-26, Uni- 
versity of Manchester, April. 

David Dowty. 1988. Type raising, functional com- 
position, and non-constituent conjunction. In R. 
Oehrle, E. Bach and D. Wheeler, editors, Catego- 
rial Grammars and Natural Language Structures. 
Reidel. 

Mark Hepple. 1987. Methods for parsing combina- 
tory categorial grammar and the spurious ambi- 
guity problem. Unpublished M.Sc. thesis, Centre 
for Cognitive Science, University of Edinburgh. 

Mark Hepple. 1990. The Grammar and Process- 
ing of Order and Dependency: A Categorial Ap- 
proach. Ph.D. thesis, University of Edinburgh. 

Mark Hepple and Glyn Morrill. 1989. Parsing and 
derivational equivalence. In Proceedings of the 
Fourth Conference of the European Chapter of the 
Association for Computational Linguistics, 10-18, 
University of Manchester, April. 

Herman Hendriks. 1993. Studied Flexibility: Cate- 
gories and Types in Syntax and Semantics. Ph.D. 
thesis, Institute for Logic, Language, and Compu- 
tation, University of Amsterdam. 

Aravind Joshi, K. Vij ay- S hanker, and David Weir. 
1991. The convergence of mildly context-sensitive 
grammar formalisms. In Foundational Lssues in 
Natural Language Processing, MIT Press. 



Lauri Karttunen. 1986. Radical lexicalism. Report 
No. CSLI-86-68, CSLI, Stanford University. 

E. Konig. 1989. Parsing as natural deduction. In 
Proceedings of the 27th Annual Meeting of the As- 
sociation for Computational Linguistics, Vancou- 
ver. 

J. Lambek. 1958. The mathematics of sen- 
tence structure. American Mathematical Monthly 
65:154-169. 

Michael Moortgat. 1990. Unambiguous proof repre- 
sentations for the Lambek Calculus. In Proceed- 
ings of the Seventh Amsterdam Colloquium. 

Michael Niv. 1994. A psycholinguistically motivated 
parser for CCG. In Proceedings of the 32nd An- 
nual Meeting of the ACL, Las Cruces, NM, June. 



cmp-lg/9406031 



Remo Pareschi and Mark Steedman. 1987. A lazy 
way to chart parse with combinatory grammars. 
In Proceedings of the 25th Annual Meeting of the 
Association for Computational Linguistics, Stan- 
ford University, July. 

Scott Prevost and Mark Steedman. 1994. Spec- 
ifying intonation from context for speech syn- 
thesis. Spe ech Communication, 15:139-153. cmp- 



lg/9407015 



Mark Steedman. 1990. Gapping as constituent coor- 
dination. Linguistics and Philosophy, 13:207-264. 

Mark Steedman. 1991. Structure and intonation. 
Language, 67:260-296. 

Mark Steedman. 1987. Combinatory grammars and 
parasitic gaps. Natural Language and Linguistic 
Theory, 5:403-439. 

Anna Szabolcsi. 1989. Bound variables in syntax: 
Are there any? In R. Bartsch, J. van Bcnthcm, 
and P. van Emde Boas (eds.), Semantics and Con- 
textual Expression, 295-318. Foris, Dordrecht. 

K. Vijay-Shanker and David Weir. 1990. Polyno- 
mial time parsing of combinatory categorial gram- 
mars. In Proceedings of the 28th Annual Meeting 
of the Association for Computational Linguistics. 

K. Vijay-Shanker and David Weir. 1993. Parsing 
some constrained grammar formalisms. Compu- 
tational Linguistics, 19(4):591-636. 

K. Vijay-Shanker and David Weir. 1994. The equiv- 
alence of four extensions of context-free gram- 
mars. Mathematical Systems Theory, 27:511-546. 

Kent Wittenburg. 1986. Natural Language Pars- 
ing with Combinatory Categorial Grammar in a 
Graph-Unification-Based Formalism. Ph.D. the- 
sis, University of Texas. 

Kent Wittenburg. 1987. Predictive combinators: A 
method for efficient parsing of Combinatory Cat- 
egorial Grammars. In Proceedings of the 25th An- 
nual Meeting of the ACL, Stanford Univ., July. 



